Search Results: "alexander"

18 June 2017

Alexander Wirt: alioth needs your help

It may look like the decision for pagure as the alioth replacement is already finalized, but that's not really true. I got a lot of feedback and tips in the last weeks, which made me postpone my decision. Several alternative systems were recommended to me, and there are probably several others. I won't be able to evaluate all of those systems in advance of our sprint. That's where you come in: if you are familiar with one of those systems, or want to get familiar with them, join us on our mailing list and create a wiki page below https://wiki.debian.org/Alioth/GitNext with a review of your system. What do we need to know? If you want to start on such a review, please announce it on the mailing list. If you have questions, ask me on IRC, Twitter or mail. Thanks for your help!

17 June 2017

Alexander Wirt: Survey about alioth replacement

To get some idea about the expectations and current usage of alioth, I created a survey. Please take part in it if you are an alioth user. If you need some background about the coming alioth replacement, I recommend reading the great LWN article written by anarcat.

14 June 2017

Antoine Beaupré: Alioth moving toward pagure

Since 2003, the Debian project has been running a server called Alioth to host source code version control systems. The server runs the Debian LTS release (Wheezy), which will hit its end of life next year; that deadline raised some questions regarding the plans for the server over the coming years. Naturally, that led to a discussion regarding possible replacements. In response, the current Alioth maintainer, Alexander Wirt, announced a sprint to migrate to pagure, a free-software "Git-centered forge" written in Python for the Fedora project, which LWN covered last year. Alioth currently runs FusionForge, previously known as GForge, which is the free-software fork of the SourceForge code base, created when that service closed its source in 2001. Alioth hosts source code repositories, mainly Git and Subversion (SVN) and, like other "forge" sites, also offers forums, issue trackers, and mailing list services. While other alternatives are still being evaluated, a consensus has emerged on a migration plan from FusionForge to a more modern and minimal platform based on pagure.

Why not GitLab? While this may come as a surprise to some who would expect Debian to use the more popular GitLab project, the discussion and decision actually took place a while back. During a lengthy debate last year, Debian contributors discussed the relative merits of different code-hosting platforms, following the initiative of Debian Developer "Pirate" Praveen Arimbrathodiyil to package GitLab for Debian. At that time, Praveen also got a public GitLab instance running for Debian (gitlab.debian.net), which was sponsored by GitLab B.V., the commercial entity behind the GitLab project. The sponsorship was originally offered in 2015 by the GitLab CEO, presumably to counter a possible move to GitHub, as there was a discussion about creating a GitHub Organization for Debian at the time. The deployment of a Debian-specific GitLab instance then raised the question of the overlap with the already existing git.debian.org service, which is backed by Alioth's FusionForge deployment. It then seemed natural that the new GitLab instance would replace Alioth. But when Praveen directly proposed to move to GitLab, Wirt stepped in and explained that a migration plan was already in progress. The plan then was to migrate to a simpler gitolite-based setup, a decision that was apparently made in corridor discussions surrounding the Alioth Git replacement BoF held during Debconf 2015. The first objection raised by Wirt against GitLab was its "huge number of dependencies". Another issue Wirt identified was the "open core / enterprise model", preferring a "real open source system", an opinion that seems shared by other participants on the mailing list. Wirt backed his concerns with a hypothetical example:
Debian needs feature X but it is already in the enterprise version. We make a patch and, for commercial reasons, it never gets merged (they already sell it in the enterprise version). Which means we will have to fork the software and keep those patches forever. Been there done that. For me, that isn't acceptable.
This concern was further deepened when GitLab's Director of Strategic Partnerships, Eliran Mesika, pointed to the company's stewardship policy, which describes how GitLab decides which features end up in the proprietary version. Praveen pointed out that:
[...] basically it boils down to features that they consider important for organizations with less than 100 developers may get accepted. I see that as a red flag for a big community like debian.
Since there are over 600 Debian Developers, the community seems to fall within the needs of "enterprise" users. The features the Debian community may need are, by definition, appropriate only to the "Enterprise Edition" (GitLab EE), the non-free version, and are therefore unlikely to end up in the "Community Edition" (GitLab CE), the free-software version. Interestingly, Mesika asked for clarification on which features were missing, explaining that GitLab is actually open to adding features to GitLab CE. The response from Debian Developer Holger Levsen was categorical: "It's not about a specific patch. Free GitLab and we can talk again." But beyond the practical and ethical concerns, some specific features Debian needs are currently only in GitLab EE. For example, debian.org systems use LDAP for authentication, which would obviously be useful in a GitLab deployment; GitLab CE supports basic LDAP authentication, but advanced features, like group or SSH-key synchronization, are only available in GitLab EE. Wirt also expressed concern about the Contributor License Agreement that GitLab B.V. requires contributors to sign when they send patches, which forces users to allow the release of their code under a non-free license. The debate then went through an exhaustive inventory of the different free-software alternatives:
  • GitLab, a Ruby-based GitHub replacement, dual-licensed MIT/Commercial
  • Gogs, Go, MIT
  • Gitblit, Java, Apache-licensed
  • Kallithea, in Python, also supports Mercurial, GPLv3
  • and finally, pagure, also written in Python, GPLv2
A feature comparison between each project was created in the Debian wiki as well. In the end, however, Praveen gave up on replacing Alioth with GitLab because of the controversy and moved on to support the pagure migration, which resolved the discussion in July 2016. More recently, Wirt admitted in an IRC conversation that "on the technical side I like GitLab a lot more than pagure" and that "as a user, GitLab is much nicer than pagure and it has those nice CI [continuous integration] features". However, as he explained in his blog "GitLab is Opencore, [and] that it is not entirely opensource. I don't think we should use software licensed under such a model for one of our core services" which leaves pagure as the only stable candidate. Other candidates were excluded on technical grounds, according to Wirt: Gogs "doesn't scale well" and a quick security check didn't yield satisfactory results; "Gitblit is Java" and Kallithea doesn't have support for accessing repositories over SSH (although there is a pending pull request to add the feature). In an email interview, Sid Sijbrandij, CEO of GitLab, did say that "we want to make sure that our open source edition can be used by open source projects". He gave examples of features liberated following requests by the community, such as branded login pages for the VLC project and GitLab Pages after popular demand. He stressed that "There are no artificial limits in our open source edition and some organizations use it with more than 20.000 users." So if the concern of the Debian community is that features may be missing from GitLab CE, there is definitely an opening from GitLab to add those features. If, however, the concern is purely ethical, it's hard to see how an agreement could be reached. As Sijbrandij put it:
On the mailinglist it seemed that some Debian maintainers do not agree with our open core business model and demand that there is no proprietary version. We respect that position but we don't think we can compete with the purely proprietary software like GitHub with this model.

Working toward a pagure migration The issue of Alioth maintenance came up again last month when Boyuan Yang asked what would happen to Alioth when support for Debian LTS (Wheezy) ends next year. Wirt brought up the pagure migration proposal and the community tried to make a plan for the migration. One of the issues raised was the question of the non-Git repositories hosted on Alioth, as pagure, like GitLab, only supports Git. Indeed, Ben Hutchings calculated that while 90% (~19,000) of the repositories currently on Alioth are Git, there are 2,400 SVN repositories and a handful of Mercurial, Bazaar (bzr), Darcs, Arch, and even CVS repositories. As part of an informal survey, however, most packaging teams explained they either had already migrated away from SVN to Git or were in the process of doing so. The largest CVS user, the web site team, also explained it was progressively migrating to Git. Mattia Rizzolo then proposed that older repository services like SVN could continue running even if FusionForge goes down, as FusionForge is, after all, just a web interface to manage those back-end services. Repository creation would be disabled, but older repositories would stay operational until they migrate to Git. This would, effectively, mean the end of non-Git repository support for new projects in the Debian community, at least officially. Another issue is the creation of a Debian package for pagure. Ironically, while Praveen and other Debian maintainers have been working for 5 years to package GitLab for Debian, pagure isn't packaged yet. Antonio Terceiro, another Debian Developer, explained this isn't actually a large problem for debian.org services: "note that DSA [Debian System Administrator team] does not need/want the service software itself packaged, only its dependencies". Indeed, for Debian-specific code bases like ci.debian.net or tracker.debian.org, it may not make sense to have the overhead of maintaining Debian packages since those tools have limited use outside of the Debian project. While Debian derivatives and other distributions could reuse them, what usually happens is that other distributions roll their own software, like Ubuntu did with the Launchpad project. Still, Paul Wise, a member of the DSA team, reasoned that it was better, in the long term, to have Debian packages for debian.org services:
Personally I'm leaning towards the feeling that all configuration, code and dependencies for Debian services should be packaged and subjected to the usual Debian QA activities but I acknowledge that the current archive setup (testing migration plus backporting etc) doesn't necessarily make this easy.
Wise did say that "DSA doesn't have any hard rules/policy written down, just evaluation on a case-by-case basis", which probably means that pagure packaging will not be a blocker for deployment. The last pending issue is the question of the mailing lists hosted on Alioth, as pagure doesn't offer mailing list management (nor does GitLab). In fact, there are three different mailing list services for the Debian project. Wirt, with his "list-master hat" on, explained that the main mailing list service is "not really suited as a self-service" and expressed concern at the idea of migrating the large number of mailing lists hosted on Alioth. Indeed, there are around 1,400 lists on Alioth while the main service has a set of 300 lists selected by the list masters. No solution for those mailing lists was found at the time of this writing. In the end, it seems like the Debian project has chosen pagure, the simpler, less featureful, but also less controversial, solution and will use the same hosting software as its fellow Linux distribution, Fedora. Wirt is also considering using FreeIPA for account management on top of pagure. The plan is to migrate away from FusionForge one bit at a time, and pagure is the solution for the first step: the Git repositories. Lists, other repositories, and additional features of FusionForge will be dealt with later on, but Wirt expects a plan to come out of the upcoming sprint. It will also be interesting to see how the interoperability promises of pagure will play out in the Debian world. Even though the federation features of pagure are still in their early stages, one can already clone issues and pull requests as Git repositories, which allows for a crude federation mechanism. In any case, given the long history and the wide variety of workflows in the Debian project, it is unlikely that a single tool will solve all problems. Alioth itself has significant overlap with other Debian services; not only does it handle mailing lists and forums, but it also has its own issue tracker that overlaps with the Debian bug tracking system (BTS). This is just the way things are in Debian: it is an old project with lots of moving parts. As Jonathan Dowland put it: "The nature of the project is loosely-coupled, some redundancy, lots of legacy cruft, and sadly more than one way to do it." Hopefully, pagure will not become part of that "legacy redundant cruft". But at this point, the focus is on keeping the services running in a simpler, more maintainable way. The discussions between Debian and GitLab are still going on as we speak, but given how controversial the "open core" model used by GitLab is for the Debian community, pagure does seem like the more logical alternative.
Note: this article first appeared in the Linux Weekly News.

13 June 2017

Reproducible builds folks: Reproducible Builds: week 111 in Stretch cycle

Here's what happened in the Reproducible Builds effort between Sunday June 4 and Saturday June 10 2017:

Past and upcoming events On June 10th, Chris Lamb presented on reproducible builds at the Hong Kong Open Source Conference 2017.

Patches and bugs filed

Reviews of unreproducible packages 7 package reviews have been added, 10 have been updated and 14 have been removed this week, adding to our knowledge about identified issues.

Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported. Two FTBFS issues of LEDE (exposed in our setup) were found and fixed.

diffoscope development

tests.reproducible-builds.org Alexander 'lynxis' Couzens made some changes for testing LEDE and OpenWrt; Hans-Christoph Steiner made changes for testing F-Droid; Daniel Shahaf and Holger 'h01ger' Levsen made changes for testing Debian.

Misc. This week's edition was written by Ximin Luo, Chris Lamb and Holger Levsen & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

7 June 2017

Alexander Wirt: Upcoming Alioth Sprint

As some of you already know, we need a replacement for alioth.debian.org. It is based on wheezy and a heavily modified version of FusionForge. Unfortunately I am the last admin left for alioth and I am not really familiar with FusionForge. After some chatting with a bunch of people we decided that we should replace alioth with a stripped down version of new services. We want to start with the basic things, git and an identity provider. For git there are two candidates: gitlab and pagure. Gitlab is really nice, but has a big problem: it is Opencore, which means that it is not entirely opensource. I don't think we should use software licensed under such a model for one of our core services. That brings us to the last candidate: pagure. Pagure is a nice project based on gitolite; it is developed by the Fedora project, which uses it for all their repos. Pagure isn't packaged for Debian yet, but that is work in progress (#829046). If you can lend a helping hand to the packager, please do so. To get things started we will have an Alioth Sprint from 18th to 20th August 2017 in Hamburg, Germany. If you want to join us, add yourself to the wiki page. For further discussions I created a mailing list on alioth. Please subscribe if you are interested in that topic.

Alexander Wirt: New blog

After a long time I decided to move my blog again to something with git and markdown in the background. I decided to give hugo a chance on my uberspace account. Uberspace is a nice, geek-driven hoster in Germany, from geeks for geeks. The blog itself is hosted on GitHub, which pushes commits to a simple Go-based webhook service built on adnanh/webhook.

30 May 2017

Reproducible builds folks: Reproducible Builds: week 109 in Stretch cycle

Here's what happened in the Reproducible Builds effort between Sunday May 21 and Saturday May 27 2017:

Past and upcoming events Bernhard M. Wiedemann gave a short talk on reproducible builds in openSUSE at the openSUSE Conference 2017. Slides and video recordings are available on that page. Chris Lamb will present on reproducible builds at the Hong Kong Open Source Conference 2017 on June 9th. Our next IRC meeting has been scheduled for Thursday June 1 at 16:00 UTC with this agenda.

Academia Justin Cappos continued his work on the reproducible builds paper, with text and suggestions from Ximin Luo integrated.

Toolchain developments #863470: "ftp.debian.org: security sync must not exclude .buildinfo" - while this bug isn't fixed, you need to make sure not to build jessie updates with stretch's dpkg, or else the upload will be rejected. Ximin Luo built GCC twice and ran diffoscope on them. Unfortunately the results were 1.7 GB in size and can't be displayed in a web browser; 99/171 of the .debs are reproducible, though. He's now working on diffoscope (see below) to make it generate output more intelligently for such large diffs. Here is a summary diff where the recursion depth cut-off was set low, so the size is reasonable and one can still see the outlines of where to look next. debuerreotype, a reproducible, snapshot-based Debian rootfs builder, was newly added to Debian unstable.

Patches and bugs filed

Reviews of unreproducible packages 29 package reviews have been added, 49 have been updated and 23 have been removed this week, adding to our knowledge about identified issues.

Weekly QA work During our reproducibility testing, FTBFS bugs have been detected and reported.

diffoscope development Development continued in git.

strip-nondeterminism development Version 0.034-1 was uploaded to unstable by Chris Lamb. It included previous weeks' contributions.

tests.reproducible-builds.org

Misc. This week's edition was written by Ximin Luo, Bernhard M. Wiedemann, Chris Lamb and Holger Levsen & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

12 May 2017

Daniel Pocock: Kamailio World and FSFE team visit, Tirana arrival

This week I've been thrilled to be in Berlin for Kamailio World 2017, one of the highlights of the SIP, VoIP and telephony enthusiast's calendar. It is an event that reaches far beyond Kamailio and is well attended by leaders of many of the well known free software projects in this space.

HOMER 6 is coming Alexandr Dubovikov gave me a sneak peek of the new version of the HOMER SIP capture framework for gathering, storing and analyzing messages in a SIP network. [Photo: exploring HOMER 6 with Alexandr Dubovikov at Kamailio World 2017]

Visiting the FSFE team in Berlin Having recently joined the FSFE's General Assembly as the fellowship representative, I've been keen to get to know more about the organization. My visit to the FSFE office involved a wide-ranging discussion with Erik Albers about the fellowship program and FSFE in general. [Photo: discussing the Fellowship program with Erik Albers]

Steak and SDR night After a hard day of SIP hacking and a long afternoon at Kamailio World's open bar, a developer needs a decent meal and something previously unseen to hack on. A group of us settled at Escados, Alexanderplatz, where my SDR kit emerged from my bag and other Debian users found out how easy it is to apt install the packages, attach the dongle and explore the radio spectrum. [Photo: playing with SDR after dinner]

Next stop OSCAL'17, Tirana Having left Berlin, I'm now in Tirana, Albania, where I'll give an SDR workshop and Free-RTC talk at OSCAL'17. The weather forecast is between 26 and 28 degrees Celsius, the food is great and the weekend's schedule is full of interesting talks and workshops. The organizing team have already made me feel very welcome here, meeting me at the airport and leaving a very generous basket of gifts in my hotel room. OSCAL has emerged as a significant annual event in the free software world and if it's too late for you to come this year, don't miss it in 2018. [Image: OSCAL'17 banner]

11 April 2017

Antoine Beaupré: A report from Netconf: Day 1

As is becoming traditional, two times a year the kernel networking community meets in a two-stage conference: an invite-only, informal, two-day plenary session called Netconf, held in Toronto this year, and a more conventional one-track conference open to the public called Netdev. I was invited to cover both conferences this year, given that Netdev was in Montreal (my hometown), and was happy to meet the crew of developers that maintain the network stack of the Linux kernel. This article covers the first day of the conference which consisted of around 25 Linux developers meeting under the direction of David Miller, the kernel's networking subsystem maintainer. Netconf has no formal sessions; although some people presented slides, interruptions are frequent (indeed, encouraged) and the focus is on hashing out issues that are blocked on the mailing list and getting suggestions, ideas, solutions, and feedback from their peers.

Removing ndo_select_queue() One of the first discussions that elicited a significant debate was the ndo_select_queue() function, a key component of the Linux polling system that determines when and how to send packets on a network interface (see netdev_pick_tx and friends). The general question was whether the use of ndo_select_queue() in drivers is a good idea. Alexander Duyck explained that Intel people were considering using ndo_select_queue() for receive/transmit queue matching. Intel drivers do not currently use the hook provided by the Linux kernel and it turns out no one is happy with ndo_select_queue(): the heuristics it uses don't really please anyone. The consensus (including from Duyck himself) seemed to be that it should just not be used anymore, or at least not used for that specific purpose. The discussion turned toward the wireless network stack, which uses it extensively, but for other purposes. Johannes Berg explained that the wireless stack uses ndo_select_queue() for traffic classification, for example to get voice traffic through even if the best-effort queue is backed up. The wireless stack could stop using it by doing flow control completely inside the wireless stack, which already uses the fq_codel flow-control mechanism for other purposes, so porting away from ndo_select_queue() seems possible there. The problem then becomes how to update all the drivers to change that behavior, which would be a lot of work. Still, it seems people are moving away from a generic ndo_select_queue() interface to stack-specific or even driver-specific (in the case of Intel) queue management interfaces.
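To make the discussion concrete, here is a minimal sketch of what a driver-specific hook looks like; the my_* names are hypothetical, and the signature shown is the one used by kernels of roughly that era (it has changed in later releases):

    /* hypothetical driver: steer control traffic to queue 0, defer the
     * rest to the kernel's default netdev_pick_tx() heuristics */
    #include <linux/netdevice.h>
    #include <linux/pkt_sched.h>

    static u16 my_select_queue(struct net_device *dev, struct sk_buff *skb,
                               void *accel_priv,
                               select_queue_fallback_t fallback)
    {
        if (skb->priority == TC_PRIO_CONTROL)
            return 0;               /* dedicated high-priority queue */
        return fallback(dev, skb);  /* default heuristics */
    }

    static const struct net_device_ops my_netdev_ops = {
        .ndo_select_queue = my_select_queue,
        /* other callbacks elided */
    };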

refcount_t followup There was a followup discussion on the integration of the refcount_t type into the network stack, which we covered recently. This type is meant to be an in-kernel defense against exploits based on overflowing or underflowing an object's reference count. The consensus seems to be that having refcount_t used for debugging is acceptable, but it cannot be enabled by default. An issue that was identified is that the networking developers are fairly sure that introducing refcount_t would have a severe impact on performance, but they do not have benchmarks to prove it, something Miller identified as a problem that needs to be worked on. Miller then expressed some openness to the idea of having it as a kernel configuration option. A similar discussion happened, on the second day, regarding the KASan memory error detector, which was covered when it was introduced in 2014. Eric Dumazet warned that there could be a lot of issues that cannot be detected by KASan because of the way the network stack often bypasses regular memory-allocation routines for performance reasons. He also noted that this can sometimes mean the stack may go over the regular 10% memory limit (the tcp_mem parameter, described in the tcp(7) man page) for certain operations, especially when rebuilding out-of-order packets with lots of parallel TCP connections. Therefore it was proposed that these special memory recycling tricks could be optionally disabled, at run- or compile-time, to instrument proper memory tracking. Dumazet argued this was a situation similar to refcount_t in that we need a way to disable the high-performance shortcuts to make the network stack easier to debug with KASan. The problem with optional parameters is that they are often disabled in production or even by default, which, in turn, means that critical bugs cannot actually be found because the code paths are not tested. When I asked Dumazet about this, he explained that Google performs integration testing of new kernels before putting them in production, and those toggles could be enabled there to find and fix those bugs. But he agreed that certain code paths are then not tested until the code gets deployed in production. So it seems the status quo remains: security folks want to improve the reliability of the kernel, but the network folks can't afford the performance cost. Yet it was clear in the discussions that the team cares about security issues and wants those issues to be fixed; the impact of some of the solutions is just too big.
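For readers unfamiliar with the API, the pattern under discussion looks like this; a minimal sketch with a hypothetical my_obj object (refcount_inc() saturates and warns instead of wrapping, which is what defeats overflow-based exploits):

    #include <linux/refcount.h>
    #include <linux/slab.h>

    struct my_obj {
        refcount_t refcnt;
        /* ... payload ... */
    };

    static void my_obj_hold(struct my_obj *obj)
    {
        refcount_inc(&obj->refcnt);          /* saturates on overflow */
    }

    static void my_obj_put(struct my_obj *obj)
    {
        if (refcount_dec_and_test(&obj->refcnt))
            kfree(obj);                      /* last reference dropped */
    }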

Lightweight wireless management packet access Berg explained that some users need to have high-performance access to certain management frames in the wireless stack and wondered how to best expose those to user space. The wireless stack already allows users to clone a network interface in "monitor" mode, but this has a big performance cost, as the radiotap header needs to be constructed from scratch and the packet header needs to be copied. As wireless improves and the bandwidth rises to gigabit levels, this can become a significant bottleneck for packet sniffers or reporting software that need to know precisely what's going on over the air outside of the regular access point client operation. It seems the proper way to do this is with an eBPF program. As Miller summarized, just add another API call that allows loading a BPF program into the kernel and then those users can use a BPF filtering point to get the statistics they need. This will require an extra hook in the wireless stack, but it seems like this is the way that will be taken to implement this feature.

VLAN 0 inconsistencies Hannes Frederic Sowa brought up the seemingly innocuous question of "how do we handle VLAN 0?" In theory, VLAN 0 means "no VLAN". But the Linux kernel currently handles this differently depending on whether the VLAN module is loaded and whether a VLAN 0 interface was created. Sometimes the VLAN tag is stripped, sometimes not. It turns out that the semantics were accidentally changed the last time this code was modified: the behavior was originally consistent, but is now broken. Sowa therefore got the go-ahead to fix this to make the behavior consistent again.

Loopy fun Then came the turn of Jamal Hadi Salim, the maintainer of the kernel's traffic-control (tc) subsystem. The first issue he brought up is a problem in the tc REDIRECT action that can create infinite loops within the kernel. The problem can be easily alleviated when loops are created on the same interface: checks can be added that just drop packets coming from the same device and rate-limit logging to avoid a denial-of-service (DoS) condition. The more serious problem occurs when a packet is forwarded from (say) interface eth0 to eth1 which then promptly redirects it from eth1 back to eth0. Obviously, this kind of problem can only be created by a user with root access so, at first glance, those issues don't seem that serious: admins can shoot themselves in the foot, so what? But things become a little more serious when you consider the container case, where an untrusted user has root access inside a container and should have constrained resource limitations. Such a loop could allow this user to deploy an effective DoS attack against a whole group of containers running on the same machine. Even worse, this endless loop could possibly turn into a deadlock in certain scenarios, as the kernel could try to transmit the packet on the same device it originated from and block, progressively filling the queues and eventually completely breaking network access. Florian Westphal argued that a container can already create DoS conditions, for example by doing a ping flood. According to Salim, this whole problem was created when two bits used for tracking such packets were reclaimed from the skb structure used to represent packets in the kernel. Those bits were a simple TTL (time to live) field that was incremented on each loop and dropped after a pre-determined limit was reached, breaking infinite loops. Salim asked everyone if this should be fixed or if we should just forget about this issue and move on. Miller proposed to keep a one-behind state for the packet, fixing the simplest case (two interfaces). The general case, however, would require a bitmap of all the interfaces to be scanned, which would impose a large overhead. Miller said an attempt to fix this should somehow be made. The root of the problem is that the network maintainers are trying to reduce the size of the skb structure, because it's used in many critical paths of the network stack. Salim's position is that, without the TTL fields, there is no way to fix the general case here, and this constitutes a security issue. So either the bits need to be brought back, or we need to live with the inherent DoS threat.
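For illustration, the reclaimed TTL mechanism Salim described would look something like the following sketch. The redirect_ttl field no longer exists in struct sk_buff (that is the point of the debate), so this is hypothetical and will not compile against current kernels:

    #define REDIRECT_TTL_MAX 3  /* hypothetical two-bit loop budget */

    static int my_redirect(struct sk_buff *skb, struct net_device *to)
    {
        if (skb->redirect_ttl++ >= REDIRECT_TTL_MAX) {
            /* loop detected: drop the packet (with rate-limited logging) */
            kfree_skb(skb);
            return -ELOOP;
        }
        skb->dev = to;
        return dev_queue_xmit(skb);
    }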

Dumping large statistics sets Another issue Salim brought up was the question of how to export large statistics sets from the kernel. It turns out that some use cases may end up dumping a lot of data. Salim mentioned a real-world tc use case that calls for reading six million entries. The current netlink-based API provides a way to get only 20 entries at a time, which means it takes forever to dump the state of all those policy actions. Salim has a patch that changes the dump size to be eight times NLMSG_GOODSIZE, which already improves performance by an order of magnitude, although there are issues with checking the user-space buffer size there. But a more complete solution is needed. What Salim proposed was a way to ask only for the states that changed since the last dump was requested. He has a patch to add a last_access field to the netlink_callback structure used by netlink_dump() to output data; that raised the question of how to actually use that field. Since Salim fetches that data every five seconds, he figured he could just tell the kernel to return all the nodes that changed in that period. But then if the dump takes more than five seconds to complete, the next dump may be missing states that changed during the extra delay. An alternative mechanism would be for the user-space utility to keep the time stamp it requested and use that as a delta for the next dump. It turns out this is a larger problem than just tc. Dumazet mentioned this was an issue with fq_codel classes: he would even like to be able to dump those statistics faster than every five seconds. Roopa Prabhu mentioned that Cumulus also has similar problems dumping stats from bridges, so clearly a more generic solution is needed here. There is, however, a fundamental problem with dumping large statistics sets from the kernel: those statistics are constantly changing while the dump is created and, unless versioning or locking mechanisms are used (which would slow things down), the data returned is bound to be only an approximation of reality. Salim promised to send a set of RFC patches to further the discussion regarding this issue, but during the following Netdev conference, Berg published a patch to fix this ten-year-old issue, which brought cheers from the audience.
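A sketch of the delta-dump idea, under the assumption that each entry records when it last changed; the my_* names are hypothetical and the netlink resume logic is simplified:

    struct my_entry {
        struct list_head list;
        unsigned long last_changed;  /* time of last update */
        /* ... counters ... */
    };

    static LIST_HEAD(my_entries);

    /* emits one entry as netlink attributes; returns < 0 if skb is full */
    static int my_entry_fill(struct sk_buff *skb, const struct my_entry *e);

    static int my_stats_dump(struct sk_buff *skb, struct netlink_callback *cb)
    {
        unsigned long since = cb->args[0];   /* timestamp of previous dump */
        struct my_entry *e;

        list_for_each_entry(e, &my_entries, list) {
            if (e->last_changed < since)
                continue;                    /* unchanged: skip */
            if (my_entry_fill(skb, e) < 0)
                return skb->len;             /* buffer full: resume later */
        }
        return 0;                            /* dump complete */
    }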
The author would like to thank the Netconf and Netdev organizers for travel to, and hosting assistance in, Toronto. Many thanks to Berg, Dumazet, Salim, and Sowa for their time taken for a technical review of this article. Note: this article first appeared in the Linux Weekly News.

Antoine Beaupré: A report from Netconf: Day 2

This article covers the second day of the informal Netconf discussions, held on April 4, 2017. Topics discussed this day included the binding of sockets in VRF, identification of eBPF programs, inconsistencies between IPv4 and IPv6, changes to data-center hardware, and more. (See this article for coverage from the first day of discussions.)

How to bind to specific sockets in VRF One of the first presentations was from David Ahern of Cumulus, who presented a few interesting questions for the audience. His first was the problem of binding sockets to a given interface. Right now, there are four different ways this can be done:
  • the old SO_BINDTODEVICE generic socket option (see socket(7))
  • the IP_PKTINFO, IP-specific socket option (see ip(7)), introduced in Linux 2.2
  • the IP_UNICAST_IF flag, introduced in Linux 3.3 for WINE
  • the IPv6 scope ID suffix, part of the IPv6 addressing standard
So there's a problem of having too many ways of doing the same thing, something that cannot really be fixed without breaking ABI compatibility. But even worse, conflicts between those options are not reported by the kernel, so it's possible for a user to set up socket flags in a way that certain flags override others, with no checks made or errors reported. It was agreed that the user should get some notification of conflicting changes here, at least. Furthermore, binding sockets to a specific VRF (Virtual Routing and Forwarding) device is not currently possible, so Ahern asked what the best way to do this would be, considering the many options available. A use case example is a UDP multicast socket that could be bound to a specific interface within a VRF. This is an old problem: Tom Herbert explained that there were previous discussions about making the bind() system call more programmable so that, for example, you could bind() a UDP socket to a discrete list of IP addresses or a subnet. So he identified this issue as a broader problem that should be addressed by making the interfaces more generic. Ahern explained that it is currently possible to bind sockets to the slave device of a VRF even though that should not be allowed. He also raised the question of how the kernel should tell which socket should be selected for incoming packets. Right now, there is a scoring mechanism for UDP sockets, but that cannot be used directly in this more general case. David Miller said that there are already different ways of specifying scope: there is the VRF layer and the namespace ("netns") layer. A long time ago, Miller reluctantly accepted the addition of netns keys everywhere, swallowing the performance cost to gain flexibility. He argued that a new key should not be added and instead existing infrastructure should be reused. Herbert argued this was exactly the reason why this should be simplified: "if we don't answer the question, people will keep on trying this". For example, one can use a VRF to limit listening addresses, but it gets complicated if we need a device for every address. It seems the consensus evolved towards using IP_UNICAST_IF, added back in 2012, which is accessible to non-root users. It is currently limited to UDP and RAW sockets, but it could be extended for TCP.
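For concreteness, the first and third mechanisms from the list above look like this from user space; a sketch assuming an interface named eth0 exists:

    #include <arpa/inet.h>   /* htonl() */
    #include <net/if.h>      /* if_nametoindex() */
    #include <netinet/in.h>  /* IPPROTO_IP, IP_UNICAST_IF */
    #include <stdio.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        /* 1. the old generic option: tie all traffic to one device
         *    (requires CAP_NET_RAW) */
        const char dev[] = "eth0";
        if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, dev, sizeof(dev)) < 0)
            perror("SO_BINDTODEVICE");

        /* 3. the newer, unprivileged option: choose the egress interface
         *    for unicast packets; takes an interface index in network
         *    byte order */
        int ifindex = htonl(if_nametoindex("eth0"));
        if (setsockopt(fd, IPPROTO_IP, IP_UNICAST_IF,
                       &ifindex, sizeof(ifindex)) < 0)
            perror("IP_UNICAST_IF");
        return 0;
    }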

XDP and eBPF program identification Ahern then turned to the problem of extracting BPF programs from the kernel. He gave the example of a simple cBPF (classic BPF) filter that checks for ARP packets. If the filter is read back from the kernel, the user gets a blob of binary data, which is hard to interpret. There is a kernel verifier that can show C-like output, but that is also difficult to interpret. Ahern then added annotations to his slide that showed what the original program actually does, which was a good demonstration of why such a feature is needed. Ahern explained that, at least for cBPF, it should be possible to recover the original plaintext, or at least something close to the original program. A first step would be to replace known constants (like 0x806 for ARP). Even with eBPF, it should be possible to improve the output. Alexei Starovoitov, the BPF maintainer, explained that it might make sense to start by returning information about the maps used by an eBPF program. Then more complex data structures could be inspected once we know their type. The first priority is to get simple debugging tools working but, in the long term, the goal is a full decompiler that can reconstruct instructions into a human-readable program. The question that remains is how to return this data. Ahern explained that right now the bpf() system call copies the data to a different file descriptor, but it could just fill in a buffer. Starovoitov argued for a file descriptor; that would allow the kernel to stream everything through the same descriptor instead of having many attach points. Netlink cannot be used for this because of its asynchronous nature. A similar issue regarding the way we identify express data path (XDP) programs (which are also written in BPF) was raised by Daniel Borkmann from Covalent. Miller explained that users will want ways to figure out which XDP program was installed, so XDP needs an introspection mechanism. We currently have SHA-1 identifiers that can be internally used to tell which binary is currently loaded, but those are not exposed to user space. Starovoitov mentioned it is now just a boolean that shows if a program is loaded or not. A use case for this, on top of just trying to figure out which BPF program is loaded, is to actually fetch the source code of a BPF program that was deployed in the field for which the source was lost. It is still uncertain that it will be possible to extract an exact copy that could then be recompiled into the same program. Starovoitov added that he needed this in production to do proper reporting.
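The ARP filter Ahern used as an example is easy to show in code. Here is a sketch of how user space builds and attaches such a classic BPF filter; reading it back (via SO_GET_FILTER) yields only the raw instruction array, which is the opaque blob being discussed:

    #include <linux/filter.h>
    #include <linux/if_ether.h>
    #include <sys/socket.h>

    /* accept only ARP frames (EtherType 0x0806, i.e. ETH_P_ARP) */
    static struct sock_filter arp_filter[] = {
        /* load the 16-bit EtherType field at offset 12 */
        BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
        /* if ARP, fall through to accept; otherwise jump to drop */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_ARP, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, 0xffff),   /* accept packet */
        BPF_STMT(BPF_RET | BPF_K, 0),        /* drop packet */
    };

    static struct sock_fprog arp_prog = {
        .len = sizeof(arp_filter) / sizeof(arp_filter[0]),
        .filter = arp_filter,
    };

    /* attached to a packet socket fd with:
     *   setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER,
     *              &arp_prog, sizeof(arp_prog));
     */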

IPv4/IPv6 equivalency The last set of issues that Ahern brought up was the question of inconsistencies between IPv4 and IPv6. It turns out that, because both protocols were (naturally) implemented separately, there are inconsistencies in how they are handled in the Linux kernel, which affect, among other things, the VRF framework. The first example he gave was the fact that IPv6 addresses added on the loopback interface generate unreachable routes in the main routing table, yet this doesn't happen with IPv4 addresses. Hannes Frederic Sowa explained this was part of the IPv6 specification: there are stronger restrictions on loopback interfaces in IPv6 than IPv4. Ahern explained that VRF loopback interfaces do not implement these restrictions and wanted to know if this was a problem. Another issue is that anycast routes are added to the wrong interface. This is apparently not specific to VRF: this was done "just because Java", and has been there from day one. It seems that the Java Virtual Machine builds its own routing table and assumes this behavior, so changing this would break every JVM out there, which is obviously not acceptable. Finally, Martin Kafai Lau asked if work should be done to merge the IPv4 and IPv6 FIB (forwarding information base) trees. The FIB tree is the data structure that represents routing tables in the Linux kernel. Miller explained that the two trees are not semantically equivalent: while IPv6 does source-address lookup and routing, IPv4 does not. We can't remove the source lookups from IPv6, because "people probably use that". According to Alexander Duyck, adding source tables to IPv4 would degrade performance to the level of IPv6 performance, which was jokingly referred to as an incentive to switch to IPv6. More seriously, Sowa argued that using the same compressed tree IPv4 uses in IPv6 could make sense. People may want to have source routing in IPv4 as well. Miller argued that the kernel is optimized for 32-bit addresses in IPv4, and conceded that it could be scaled to 64-bit subnets, but 128-bit addresses would be much harder. Sowa suggested that they could be limited to 64 bits, as global routes that are announced over BGP usually have such a limit, and more specific routes are usually at discrete prefixes like /65, /127 (for interconnect links) or /128 (for point-to-point links). He expressed concerns over the reliability of such an implementation so, at this point, it is unlikely that the data structures could be merged. What is more likely is that the code path could be merged and simplified, while keeping the data structures separate.

Module options substitutions The next issue was raised by Jiří Pírko, who asked how to pass configuration options to a driver before the driver is initialized. Some chips require that some settings be sent before the firmware is loaded, which leads to a weird situation where there is a need to address a device before it's actually recognized by the kernel. The question can be summarized as: how do you pass information to a device that doesn't exist yet? The answer seems to be that devlink could do this, as it has access to the full device tree and, therefore, to devices that can be addressed by (say) PCI identifiers. Then a possible devlink command could look something like:
    devlink dev pci/0000:03:00.0 option set foo bar
This idea raised a bunch of extra questions: some devices don't have a one-to-one mapping with the PCI bridge identifiers, for example, meaning that those identifiers cannot be used to access such devices. Another issue is that you may want to send multiple settings in a single transaction, which doesn't fit well in the devlink model. Miller then proposed to let the driver initialize itself to some state and wait for configuration to be sent when necessary. Another way would be to unregister the driver and re-register with the given configuration. Shrijeet Mukherjee explained that right now, Cumulus is doing this using horrible startup script magic by retrying and re-registering, but it would be nice to have a more standard way to do this.

Control over UAPI patches Another issue that came up was the problem of changes in the user-space API (UAPI) that break backward compatibility. Pírko said that "we have to be more careful about those changes". The problem is that reviewers are not always available to make detailed reviews of such changes and may not notice API-breaking changes. Pírko proposed creating a bot to check if a given patch introduces UAPI changes, changes in structs, or in netlink enums. Miller said he could block merges until discussions happen and that patchwork, which he uses to process patches from the mailing list, does some of this. He also pointed out there aren't enough test cases in the first place. Starovoitov argued that UAPI isn't special; there are other ways of breaking backward compatibility. He expressed concerns that such a bot could create a false sense that everything is fine while a patch could break compatibility and not be detected. Miller countered that UAPI is special in that "we're stuck with it forever". He then went on to propose that, since there's a maintainer (or more) for each module, he can make sure that each maintainer explicitly approves changes to those modules.

Data-center hardware changes Starovoitov brought up the issue of a new type of hardware that is currently being deployed in data centers called a "multi-host NIC" (network interface card). It's a single NIC that is connected to multiple servers. Facebook, for example, uses this in its Yosemite platform that shoves twelve servers into a 2U rack mount, in three modules. Each module is made of four servers connected to the traditional switch fabric with a single NIC through PCI-Express. Mellanox and Broadcom also have similar devices. One question is how to manage those devices. Since they are connected through a PCI-Express bus, Linux will see them as a NIC, yet they are also a little like switches, in that they interconnect multiple servers. Furthermore, the kernel security model assumes that a NIC is trusted, and gladly opens its own memory to NICs through DMA; this can become a huge security issue when the NIC is under the control of another server. This can become especially problematic if we consider that there could be TLS hardware offloading in the future with the introduction of in-kernel TLS stacks. The other problem is the question of reliability: since those devices are currently "dumb", they need to be managed just like a regular NIC. If the host managing the card crashes, it could disable a whole set of servers that rely on the same NIC. There could be an election process among the servers, but that complicates significantly what used to be a simple PCI connection. Mukherjee pointed out that the model Cisco uses for this is that the "smart NIC" is a "slave" of the main switch fabric. It's a daughter card, which makes it easier to manage from a network perspective. It is clear that Linux will need a way to represent those devices, probably through the newly introduced switchdev or DSA (distributed switch architecture), but it will be something to keep an eye on as density increases in the data center. There were many more discussions during Netconf, too many to cover here, but in the end, Miller thanked everyone for all the interesting topics as the participants dispersed for a day off to travel to Montreal to attend the following Netdev conference.
The author would like to thank the Netconf and Netdev organizers for travel to, and hosting assistance in, Toronto. Many thanks to Alexei Starovoitov for his time taken for a technical review of this article. Note: this article first appeared in the Linux Weekly News.

23 March 2017

Simon McVittie: GTK hackfest 2017: D-Bus communication with containers

At the GTK hackfest in London (which accidentally became mostly a Flatpak hackfest) I've mainly been looking into how to make D-Bus work better for app container technologies like Flatpak and Snap. The initial motivating use cases are:

At the moment, Flatpak runs a D-Bus proxy for each app instance that has access to D-Bus, connects to the appropriate bus on the app's behalf, and passes messages through. That proxy is in a container similar to the actual app instance, but not actually the same container; it is trusted to not pass messages through that it shouldn't pass through. The app-identification mechanism works in practice, but is Flatpak-specific, and has a known race condition due to process ID reuse and limitations in the metadata that the Linux kernel maintains for AF_UNIX sockets. In practice the use of X11 rather than Wayland in current systems is a much larger loophole in the container than this race condition, but we want to do better in future.

Meanwhile, Snap does its sandboxing with AppArmor, on kernels where it is enabled both at compile-time (Ubuntu, openSUSE, Debian, Debian derivatives like Tails) and at runtime (Ubuntu, openSUSE and Tails, but not Debian by default). Ubuntu's kernel has extra AppArmor features that haven't yet gone upstream, some of which provide reliable app identification via LSM labels, which dbus-daemon can learn by querying its AF_UNIX socket. However, other kernels like the ones in openSUSE and Debian don't have those. The access-control (AppArmor mediation) is implemented in upstream dbus-daemon, but again doesn't work portably, and is not sufficiently fine-grained or flexible to do some of the things we'll likely want to do, particularly in dconf.

After a lot of discussion with dconf maintainer Allison Lortie and Flatpak maintainer Alexander Larsson, I think I have a plan for fixing this. This is all subject to change: see fd.o #100344 for the latest ideas.

Identity model Each user (uid) has some uncontained processes, plus 0 or more containers. The uncontained processes include dbus-daemon itself, desktop environment components such as gnome-session and gnome-shell, the container managers like Flatpak and Snap, and so on. They have the user's full privileges, and in particular they are allowed to do privileged things on the user's session bus (like running dbus-monitor), and act with the user's full privileges on the system bus. In generic information security jargon, they are the trusted computing base; in AppArmor jargon, they are unconfined.

The containers are Flatpak apps, or Snap apps, or other app-container technologies like Firejail and AppImage (if they adopt this mechanism, which I hope they will), or even a mixture (different app-container technologies can coexist on a single system). They are containers (or container instances) and not "apps", because in principle, you could install com.example.MyApp 1.0, run it, and while it's still running, upgrade to com.example.MyApp 2.0 and run that; you'd have two containers for the same app, perhaps with different permissions.

Each container has a container type, which is a reversed DNS name like org.flatpak or io.snapcraft representing the container technology, and an app identifier, an arbitrary non-empty string whose meaning is defined by the container technology. For Flatpak, that string would be another reversed DNS name like com.example.MyGreatApp; for Snap, as far as I can tell it would look like example-my-great-app.
The container technology can also put arbitrary metadata on the D-Bus representation of a container, again defined and namespaced by the container technology. For instance, Flatpak would use some serialization of the same fields that go in the Flatpak metadata file at the moment. Finally, the container has an opaque container identifier identifying a particular container instance. For example, launching com.example.MyApp twice (maybe different versions or with different command-line options to flatpak run) might result in two containers with different privileges, so they need to have different container identifiers.

Contained server sockets App-container managers like Flatpak and Snap would create an AF_UNIX socket inside the container, bind() it to an address that will be made available to the contained processes, and listen(), but not accept() any new connections. Instead, they would fd-pass the new socket to the dbus-daemon by calling a new method, and the dbus-daemon would proceed to accept() connections after the app-container manager has signalled that it has called both bind() and listen(). (See fd.o #100344 for full details.)

Processes inside the container must not be allowed to contact the AF_UNIX socket used by the wider, uncontained system - if they could, the dbus-daemon wouldn't be able to distinguish between them and uncontained processes and we'd be back where we started. Instead, they should have the new socket bind-mounted into their container's XDG_RUNTIME_DIR and connect to that, or have the new socket set as their DBUS_SESSION_BUS_ADDRESS and be prevented from connecting to the uncontained socket in some other way. Those familiar with the kdbus proposals a while ago might recognise this as being quite similar to kdbus' concept of endpoints, and I'm considering reusing that name.

Along with the socket, the container manager would pass in the container's identity and metadata, and the method would return a unique, opaque identifier for this particular container instance. The basic fields (container technology, technology-specific app ID, container ID) should probably be added to the result of GetConnectionCredentials(), and there should be a new API call to get all of those plus the arbitrary technology-specific metadata.

When a process from a container connects to the contained server socket, every message that it sends should also have the container instance ID in a new header field. This is OK even though dbus-daemon does not (in general) forbid sender-specified future header fields, because any dbus-daemon that supported this new feature would guarantee to set that header field correctly, the existing Flatpak D-Bus proxy already filters out unknown header fields, and adding this header field is only ever a reduction in privilege.

The reasoning for using the sender's container instance ID (as opposed to the sender's unique name) is for services like dconf to be able to treat multiple unique bus names as belonging to the same equivalence class of contained processes: instead of having to look up the container metadata once per unique name, dconf can look it up once per container instance the first time it sees a new identifier in a header field. For the second and subsequent unique names in the container, dconf can know that the container metadata and permissions are identical to the one it already saw.

Access control In principle, we could have the new identification feature without adding any new access control, by keeping Flatpak's proxies.
However, in the short term that would mean we'd be adding new API to set up a socket for a container without any access control, and having to keep the proxies anyway, which doesn't seem great; in the longer term, I think we'd find ourselves adding a second new API to set up a socket for a container with new access control. So we might as well bite the bullet and go for the version with access control immediately.

In principle, we could also avoid the need for new access control by ensuring that each service that will serve contained clients does its own. However, that makes it really hard to send broadcasts and not have them unintentionally leak information to contained clients - we would need to do something more like kdbus' approach to multicast, where services know who has subscribed to their multicast signals, and that is just not how dbus-daemon works at the moment. If we're going to have access control for broadcasts, it might as well also cover unicast.

The plan is that messages from containers to the outside world will be mediated by a new access control mechanism, in parallel with dbus-daemon's current support for firewall-style rules in the XML bus configuration, AppArmor mediation, and SELinux mediation. A message would only be allowed through if the XML configuration, the new container access control mechanism, and the LSM (if any) all agree it should be allowed.

By default, processes in a container can send broadcast signals, and send method calls and unicast signals to other processes in the same container. They can also receive method calls from outside the container (so that interfaces like org.freedesktop.Application can work), and send exactly one reply to each of those method calls. They cannot own bus names, communicate with other containers, or send file descriptors (which reduces the scope for denial of service).

Obviously, that's not going to be enough for a lot of contained apps, so we need a way to add more access. I'm intending this to be purely additive (start by denying everything except what is always allowed, then add new rules), not a mixture of adding and removing access like the current XML policy language. There are two ways we've identified for rules to be added: Initially, many contained apps would work in the first way (and in particular sockets=session-bus would add a rule that allows almost everything), while over time we'll probably want to head towards recommending more use of the second.

Related topics

Access control on the system bus We talked about the possibility of using a very similar ruleset to control access to the system bus, as an alternative to the XML rules found in /etc/dbus-1/system.d and /usr/share/dbus-1/system.d. We didn't really come to a conclusion here. Allison had the useful insight that the XML rules are acting like a firewall: they're something that is placed in front of potentially-broken services, and not part of the services themselves (which, as with firewalls like ufw, makes it seem rather odd when the services themselves install rules). D-Bus system services already have total control over what requests they will accept from D-Bus peers, and if they rely on the XML rules to mediate that access, they're essentially rejecting that responsibility and hoping the dbus-daemon will protect them.
Related topics

Access control on the system bus

We talked about the possibility of using a very similar ruleset to control access to the system bus, as an alternative to the XML rules found in /etc/dbus-1/system.d and /usr/share/dbus-1/system.d. We didn't really come to a conclusion here. Allison had the useful insight that the XML rules are acting like a firewall: they're something that is placed in front of potentially-broken services, and not part of the services themselves (which, as with firewalls like ufw, makes it seem rather odd when the services themselves install rules). D-Bus system services already have total control over what requests they will accept from D-Bus peers, and if they rely on the XML rules to mediate that access, they're essentially rejecting that responsibility and hoping the dbus-daemon will protect them.

The D-Bus maintainers would much prefer it if system services took responsibility for their own access control (with or without using polkit), because fundamentally the system service is always going to understand its domain and its intended security model better than the dbus-daemon can. Analogously, when a network service listens on all addresses and accepts requests from elsewhere on the LAN, we sometimes work around that by protecting it with a firewall, but the optimal resolution is to get that network service fixed to do proper authentication and access control instead. For system services, we continue to recommend essentially this "firewall" configuration, filling in the ${...} variables as appropriate:
<busconfig>
    <policy user="${the daemon uid under which the service runs}">
        <allow own="${the service's bus name}"/>
    </policy>
    <policy context="default">
        <allow send_destination="${the service's bus name}"/>
    </policy>
</busconfig>
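Filled in for a hypothetical service (names invented for illustration) that owns the bus name com.example.Thermostat and runs as the system user thermostatd, that template becomes:

<busconfig>
    <policy user="thermostatd">
        <allow own="com.example.Thermostat"/>
    </policy>
    <policy context="default">
        <allow send_destination="com.example.Thermostat"/>
    </policy>
</busconfig>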
We discussed the possibility of moving towards a model where the daemon uid to be allowed is written in the .service file, together with an opt-in to "modern D-Bus access control" that makes the "firewall" unnecessary; after some flag day when all significant system services follow that pattern, dbus-daemon would even have the option of no longer applying the "firewall" (moving to an allow-by-default model) and just refusing to activate system services that have not opted in to being safe to use without it. However, the "firewall" also protects system bus clients, and services like Avahi that are not bus-activatable, against unintended access, which is harder to solve via that approach; so this is going to take more thought.

For system services' clients that follow the "agent" pattern (BlueZ, polkit, NetworkManager, Geoclue), the correct "firewall" configuration is more complicated. At some point I'll try to write up a best-practice for these.

New header fields for the system bus

At the moment, it's harder than it needs to be to provide non-trivial access control on the system bus, because on receiving a method call, a service has to remember what was in the method call, then call GetConnectionCredentials() to find out who sent it, and only then process the actual request, once it has the information necessary to do access control.

Allison and I had hoped to resolve this by adding new D-Bus message header fields with the user ID, the LSM label, and other interesting facts for access control. These could be "opt-in" to avoid increasing message sizes for no reason: in particular, it is not typically useful for session services to receive the user ID, because only one user ID is allowed to connect to the session bus anyway.

Unfortunately, the dbus-daemon currently lets unknown fields through without modification. With hindsight this seems an unwise design choice, because header fields are a finite resource (there are 255 possible header fields) and are defined by the D-Bus Specification. The only field that can currently be trusted is the sender's unique name, because the dbus-daemon sets that field, overwriting the value in the original message (if any). To make it safe to rely on the new fields, we would have to make the dbus-daemon filter out all unknown header fields, and introduce a mechanism for the service to check (during connection to the bus) whether the dbus-daemon is sufficiently new that it does so. If connected to an older dbus-daemon, the service would not be able to rely on the new fields being true, so it would have to ignore them and treat them as unset. The specification is sufficiently vague that making new dbus-daemons filter out unknown header fields is a valid change (it just says that "Header fields with an unknown or unexpected field code must be ignored", without specifying who must ignore them, so having the dbus-daemon delete those fields seems spec-compliant).

This all seemed fine when we discussed it in person; but GDBus already has accessors for arbitrary header fields by numeric ID, and I'm concerned that this might make it too easy for a system service to be accidentally insecure: it would be natural (but wrong!) for an implementor to assume that if g_dbus_message_get_header (message, G_DBUS_MESSAGE_HEADER_FIELD_SENDER_UID) returned non-NULL, then that was guaranteed to be the correct, valid sender uid. As a result, fd.o #100317 might have to be abandoned. I think more thought is needed on that one.
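To illustrate the pitfall in code, the following Python sketch models a service deciding whether a uid taken from such a header can be trusted. Everything here is hypothetical - the SENDER_UID field does not exist in D-Bus today, and all names are invented for illustration.

 # Hypothetical new header field code, for illustration only.
 HEADER_FIELD_SENDER_UID = 42

 def trusted_sender_uid(headers, daemon_filters_unknown_fields):
     """Return a uid that is safe to use for access control, or None."""
     uid = headers.get(HEADER_FIELD_SENDER_UID)
     if uid is None:
         return None
     # The natural (but wrong!) move is to trust uid unconditionally.
     # It is only trustworthy if this dbus-daemon is known to strip
     # unknown header fields from senders' messages; otherwise the
     # sender itself could have forged the field.
     return uid if daemon_filters_unknown_fields else None

 # Against an old dbus-daemon the service must instead fall back to
 # GetConnectionCredentials():
 assert trusted_sender_uid({HEADER_FIELD_SENDER_UID: 1000}, False) is None
 assert trusted_sender_uid({HEADER_FIELD_SENDER_UID: 1000}, True) == 1000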
Unrelated topics

As happens at any good meeting, we took the opportunity of high-bandwidth discussion to cover many useful things and several useless ones. Other discussions that I got into during the hackfest included, in no particular order:

More notes are available from the GNOME wiki.

Acknowledgements

The GTK hackfest was organised by GNOME and hosted by Red Hat and Endless. My attendance was sponsored by Collabora. Thanks to all the sponsors and organisers, and the developers and organisations who attended.

18 January 2017

Reproducible builds folks: Reproducible Builds: week 90 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday January 8 and Saturday January 14 2017:

Upcoming Events

Reproducible work in other projects

Reproducible Builds have been mentioned in the FSF high-priority project list. The F-Droid Verification Server has been launched. It rebuilds apps from source that were built by f-droid.org and checks that the results match. Bernhard M. Wiedemann did some more work on reproducibility for openSUSE. Bootstrappable.org (unfortunately no HTTPS yet) was launched after the initial work was started at our recent summit in Berlin. This is another topic related to reproducible builds, and both will be needed in order to perform "Diverse Double Compilation" in practice in the future.

Toolchain development and fixes

Ximin Luo researched data formats for SOURCE_PREFIX_MAP and explored different options for encoding a map data structure in a single environment variable (a toy illustration of one possible encoding follows at the end of this post). He also continued to talk with the rustc team on the topic. Daniel Shahaf filed #851225 ('udd: patches: index by DEP-3 "Forwarded" status') to make it easier to track our patches. Chris Lamb forwarded #849972 upstream to yard, a Ruby documentation generator. Upstream has fixed the issue as of release 0.9.6. Alexander Couzens (lynxis) has made mksquashfs reproducible and is looking for testers. It compiles on BSD systems such as FreeBSD, OpenBSD and NetBSD.

Bugs filed

Chris Lamb: Lucas Nussbaum: Nicola Corna:

Reviews of unreproducible packages

13 package reviews have been added and 13 have been removed in this week, adding to our knowledge about identified issues. 1 issue type has been added:

Weekly QA work

During our reproducibility testing, the following FTBFS bugs have been detected and reported by:

diffoscope development

Bugs in diffoscope in the last year

Many bugs were opened in diffoscope during the past few weeks, which probably is a good sign, as it shows that diffoscope is much more widely used than a year ago. We have been working hard to squash many of them in time for Debian stable, though we will see how that goes in the end.

reproducible-website development

tests.reproducible-builds.org

Misc.

This week's edition was written by Ximin Luo, Chris Lamb and Holger Levsen and reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.
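As promised above, a toy illustration of encoding an ordered prefix map in a single environment variable. This is not necessarily the format Ximin settled on; the escaping scheme here is invented purely for illustration.

 from urllib.parse import quote, unquote

 def encode_prefix_map(pairs):
     # Percent-escape the separator characters, then join key=value
     # pairs with ':' so the whole map fits in one variable.
     return ":".join(quote(src, safe="") + "=" + quote(dst, safe="")
                     for src, dst in pairs)

 def decode_prefix_map(value):
     return [tuple(unquote(part) for part in item.split("=", 1))
             for item in value.split(":") if item]

 env_value = encode_prefix_map([("/home/user/build", "/usr/src/pkg")])
 assert decode_prefix_map(env_value) == [("/home/user/build", "/usr/src/pkg")]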

2 January 2017

Ross Gammon: Happy New Year My Free Software activities in December 2016

So that was 2016! Here's a summary of what I got up to on my computer(s) in December, a check of how I went against my plan, and the TODO list for the next month or so. With a short holiday to Oslo, Christmas holidays, Christmas parties (at work and with Alexander at school, football etc.), travelling to Brussels with work, and birthdays (Alexander & Antje), I missed a lot of deadlines and failed to reach most of my Free Software goals (including my goals for new & updated packages in Debian Stretch; the soft freeze is in a couple of days). To top it all off, I lost my grandmother at the ripe old age of 93. Rest in peace, Nana. I wish I could have made it to the funeral, but it is sometimes tough living on the other side of the world from your family.

Debian

Ubuntu

Other

Plan status & update for next month

Debian

Before the 5th January 2017 Debian Stretch soft freeze I hope to: For the Debian Stretch release:

Ubuntu

Other

30 November 2016

Arturo Borrero González: Creating a team for netfilter packages in debian

Debian - Netfilter

There are about 15 Netfilter packages in Debian, and they are maintained by separate people. Yesterday, I contacted the maintainers of the main packages to propose the creation of a pkg-netfilter team to maintain all the packages together. The benefits of maintaining packages in a team are already known to all, and I would expect this move to raise the overall quality of the packages. By now, the involved packages and maintainers are: We should probably ping Jochen Friedrich as well, who maintains arptables and ebtables. Also, there are some other non-official Netfilter packages, like iptables-persistent. I'm undecided about what to do with them, as my primary impulse is to only put upstream packages in the team. Given that the release of Stretch is just some months ahead, the creation of this packaging team will happen after the release, so there is no hurry to move things now.

6 October 2016

Reproducible builds folks: Reproducible Builds: week 75 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday September 25 and Saturday October 1 2016:

Statistics

For the first time, we reached 91% reproducible packages in our testing framework on testing/amd64 using a deterministic build path. (This is what we recommend to make packages in Stretch reproducible; a toy illustration of why build paths matter follows at the end of this post.) For unstable/amd64, where we additionally test for reproducibility across different build paths, we are at almost 76% again.

IRC meetings

We have a poll to set a time for a new regular IRC meeting. If you would like to attend, please input your available times and we will try to accommodate you. There was a trial IRC meeting on Friday, 2016-09-30 1800 UTC. Unfortunately, we did not activate meetbot. Despite this, participants consider the meeting a success, as several topics were discussed (eg changes to IRC notifications of tests.r-b.o) and the meeting stayed within one hour in length.

Upcoming events

Reproduce and Verify Filesystems - Vincent Batts, Red Hat - Berlin (Germany), 5th October, 14:30 - 15:20 @ LinuxCon + ContainerCon Europe 2016. From Reproducible Debian builds to Reproducible OpenWrt, LEDE & coreboot - Holger "h01ger" Levsen and Alexander "lynxis" Couzens - Berlin (Germany), 13th October, 11:00 - 11:25 @ OpenWrt Summit 2016. Introduction to Reproducible Builds - Vagrant Cascadian will be presenting at the SeaGL.org Conference in Seattle (USA), November 11th-12th, 2016.

Previous events

GHC Determinism - Bartosz Nitka, Facebook - Nara (Japan), 24th September, ICFP 2016.

Toolchain development and fixes

Michael Meskes uploaded bsdmainutils/9.0.11 to unstable with a fix for #830259 based on Reiner Herrmann's patch. This fixed the locale_dependent_symbol_order_by_lorder issue in the affected packages (freebsd-libs, mmh). devscripts/2.16.8 was uploaded to unstable. It includes a debrepro script by Antonio Terceiro which is similar in purpose to reprotest but more lightweight: specific to Debian packages and without support for virtual servers or configurable variations.

Packages reviewed and fixed, and bugs filed

The following updated packages have become reproducible in our testing framework after being fixed: The following updated packages appear to be reproducible now for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.) Some uploads have addressed some reproducibility issues, but not all of them: Patches submitted that have not made their way to the archive yet:

Reviews of unreproducible packages

77 package reviews have been added, 178 have been updated and 80 have been removed in this week, adding to our knowledge about identified issues. 6 issue types have been updated:

Weekly QA work

As part of reproducibility testing, FTBFS bugs have been detected and reported by:

diffoscope development

A new version of diffoscope, 61, was uploaded to unstable by Chris Lamb. It included contributions from: Post-release there were further contributions from:

reprotest development

A new version of reprotest, 0.3.2, was uploaded to unstable by Ximin Luo. It included contributions from: Post-release there were further contributions from:

tests.reproducible-builds.org

Misc.

This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.
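As referenced above, a toy illustration (invented for this purpose, not our actual test framework) of why build paths matter: if a build embeds its own path in the output, two builds of the same source from different directories stop being bit-identical.

 import hashlib

 def toy_build(build_path, embeds_path):
     # Stand-in for a real package build producing some artifact bytes.
     payload = b"object code"
     if embeds_path:  # e.g. __FILE__ or debug info capturing the path
         payload += build_path.encode()
     return hashlib.sha256(payload).hexdigest()

 # Path-independent builds are reproducible across build directories...
 assert toy_build("/build/1st", embeds_path=False) == toy_build("/build/2nd", embeds_path=False)
 # ...while path-embedding builds are not.
 assert toy_build("/build/1st", embeds_path=True) != toy_build("/build/2nd", embeds_path=True)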

12 September 2016

Reproducible builds folks: Reproducible Builds: week 72 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday September 4 and Saturday September 10 2016:

Reproducible work in other projects

Python 3.6's dictionary type now retains the insertion order (a short demonstration follows at the end of this post). Thanks to themill for the report. In coreboot, Alexander Couzens committed a change to make their release archives reproducible.

Patches submitted

Reviews of unreproducible packages

We've been adding to our knowledge about identified issues. 3 issue types have been added: 1 issue type has been updated: 16 have been updated: 13 have been removed, not including removed packages: 100s of packages have been tagged with the more generic captures_build_path, and many with captures_kernel_version, user_hostname_manually_added_requiring_further_investigation, captures_shell_variable_in_autofoo_script, etc. Particular thanks to Emanuel Bronshtein for his work here.

Weekly QA work

FTBFS bugs have been reported by:

diffoscope development

strip-nondeterminism development

tests.reproducible-builds.org:

Misc.

This week's edition was written by Chris Lamb and Holger Levsen and reviewed by a bunch of Reproducible Builds folks on IRC.
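The Python 3.6 behaviour mentioned above is easy to demonstrate, and it matters for reproducibility because dict iteration order often feeds directly into generated output. (In 3.6 this was an implementation detail; it became a language guarantee in 3.7.)

 # Python 3.6+: dicts iterate in insertion order.
 d = {}
 for name in ["zlib", "acl", "bash"]:
     d[name] = True
 assert list(d) == ["zlib", "acl", "bash"]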

28 July 2016

Michael Prokop: systemd backport of v230 available for Debian/jessie

At DebConf 16 I was working on a systemd backport for Debian/jessie. The results are officially available via the Debian archive now. In Debian jessie we have systemd v215 (which originally dates back to 2014-07-03 upstream-wise, plus changes + fixes from the pkg-systemd folks, of course). Now, via Debian backports, you have the option to update systemd to a very recent version: v230. If you have jessie-backports enabled it's just an apt install systemd -t jessie-backports away. For the upstream changes between v215 and v230, see upstream's NEWS file. (Actually, the systemd backport has been available since 2016-07-19 for amd64, arm64 + armhf, though for mips, mipsel, powerpc, ppc64el + s390x we had to fight against GCC ICEs when compiling on/for Debian/jessie, and for the i386 architecture the systemd test-suite identified broken O_TMPFILE permission handling.) Thanks to Alexander Wirt from the backports team for accepting my backport, thanks to intrigeri for the related apparmor backport, Guus Sliepen for the related ifupdown backport and Didier Raboud for the related usb-modeswitch/usb-modeswitch-data backports. Thanks to everyone testing my systemd backport and reporting feedback. Thanks a lot to Felipe Sateler and Martin Pitt for reviews, feedback and cooperation. And special thanks to Michael Biebl for all his feedback, reviews and help with the systemd backport from its very beginnings until the latest upload. PS: I cannot stress enough how fantastic Debian's pkg-systemd team is. Responsive, friendly, helpful, dedicated and skilled folks - thanks, folks!

17 June 2016

Matthias Klumpp: A few words about the future of the Limba project

I wanted to write this blogpost since April, and even announced it in two previous posts, but never got to actually write it until now. And with the recent events in Snappy and Flatpak land, I can not defer this post any longer (unless I want to answer the same questions over and over on IRC ^^). As you know, I have been developing the Limba 3rd-party software installer since 2014 (see this LWN article explaining the project better than I could), a spiritual successor to the Listaller project which had been in development since roughly 2008. Limba got some competition from Flatpak and Snappy, so it's natural to ask what the project's next steps will be.

Meeting with the competition

At the last FOSDEM and at the GNOME Software sprint this year in April, I met with Alexander Larsson and we discussed the rather unfortunate situation we got into, with Flatpak and Limba being in competition. Both Alex and I have been experimenting with 3rd-party app distribution for quite some time, with me working on Listaller and him working on Glick and Glick2. None of these projects ever went anywhere. Around the time when I started Limba, fixing design mistakes made with Listaller, Alex started a new attempt at software distribution, this time with sandboxing added to the mix and a new OSTree-based design of the software-distribution mechanism. It wasn't at all clear back then that XdgApp, later to be renamed to Flatpak, would get huge backing from GNOME and later Red Hat, becoming a very promising candidate for a truly cross-distro software distribution system.

The main difference between Limba and Flatpak is that Limba allows modular runtimes, with things like the toolkit, helper libraries and programs being separate modules which can be updated independently. Flatpak, on the other hand, allows just one static runtime and requires everything that is not in the runtime already to be bundled with the actual application. So, while a Limba bundle might depend on multiple individual other bundles, Flatpak bundles have only one fixed dependency on a runtime. A compromise between those two concepts is not possible, and since the modular vs. static approaches in Limba and Flatpak were fundamental, conscious design decisions, merging the projects was not possible either. Alex and I had very productive discussions, and except for the modularity issue, we were pretty much on the same page in every other aspect of the sandboxing and app-distribution matters.

Sometimes stepping out of the way is the best way to achieve progress

So, what to do now? Obviously, I can continue to push Limba forward, but given all the other projects I maintain, this seems to be a waste of resources (Limba eats a lot of my spare time). Now with Flatpak and Snappy being available, I am basically competing with Canonical and Red Hat, who can make much more progress much faster than I can as a single developer. Also, Flatpak's bigger base of contributors compared to Limba is a clear sign of which project the community favors. Furthermore, I started the Listaller and Limba projects to scratch an itch. When I was new to Linux, it was very annoying to see some applications only being made available in compiled form for one distribution, and sometimes one that I didn't use. Getting software was incredibly hard for me as a newbie, and using the package manager was also unusual back then (no software-center apps existed, only package lists).
If you wanted to update one app, you usually needed to update your whole distribution, sometimes even to a development version or rolling-release channel, sacrificing stability. So if this issue now gets solved well by someone else, there is no point in pushing my solution hard. I developed a tool to solve a problem, and it looks like another tool will fix that issue before mine does, which is fine, because this longstanding problem will finally be solved. And that's the thing I actually care most about. I still think Limba is the superior solution for app distribution, but it is also the most complex one, and it requires additional work by upstream projects to use it properly, which is something most projects don't want, and that's completely fine. And that being said: I think Flatpak is a great project. Alex has much experience in this matter, and the design of Flatpak is sound. It solves many issues 3rd-party app development faces in a pretty elegant way, and I like it very much for that. Also the focus on sandboxing is great, although that part will need more time to become really useful. (Aside from that, working with Alexander is a pleasure, and he really cares about making Flatpak a truly cross-distributional, vendor-independent project.)

Moving forward

So, what I will do now is not stop Limba development completely, but keep it going as a research project. Maybe one can use Limba bundles to create Flatpak packages more easily. We also discussed having Flatpak launch applications installed by Limba, which would allow both systems to coexist and benefit from each other. Since Limba (unlike Listaller) was also explicitly designed for web applications, and therefore has a slightly wider scope than Flatpak, this could make sense. In any case, though, I will invest much less time in the Limba project. This is good news for all the people out there using the Tanglu Linux distribution, AppStream-metadata-consuming services, PackageKit on Debian, etc.; those will receive more attention. An integral part of Limba is a web service called LimbaHub, which accepts new bundles, does QA on them and publishes them in a public repository. I will likely rewrite it to be a service using Flatpak bundles, maybe even supporting Flatpak bundles and Limba bundles side-by-side (and, if useful, maybe also supporting AppImageKit and Snappy). But this project is still on the drawing board. Let's see.

P.S.: If you come to DebConf in Cape Town, make sure not to miss my talks about AppStream and bundling.

27 April 2016

Niels Thykier: auto-decrufter in top 5 after 10 months

About 10 months ago, we enabled an auto-decrufter in dak. Then, after 3 months, it had become the top 11th "remover". Today, there are only 3 humans left that have removed more packages than the auto-decrufter; impressively enough, one of them is not even an active FTP-master (anymore). The current score board:
 5371 Luca Falavigna
 5121 Alexander Reichle-Schmehl
 4401 Ansgar Burchardt
 3928 DAK's auto-decrufter
 3257 Scott Kitterman
 2225 Joerg Jaspert
 1983 James Troup
 1793 Torsten Werner
 1025 Jeroen van Wolffelaar
  763 Ryan Murray
For comparison, here is the number of removals by year for the past 6 years:
 5103 2011
 2765 2012
 3342 2013
 3394 2014
 3766 2015  (1842 removed by auto-decrufter)
 2845 2016  (2086 removed by auto-decrufter)
Which tells us that in 2015, the FTP masters and the decrufter performed on average over 10 removals a day. And by the looks of it, 2016 will surpass that. Of course, the auto-decrufter has a tendency to increase the number of removed items, since it is an advocate of "remove early, remove often!" :) Data is from https://ftp-master.debian.org/removals-full.txt. Scoreboard computed as:
  grep ftpmaster: removals-full.txt | \
   perl -pe 's/.*ftpmaster:\s+//; s/\]$//;' | \
   sort | uniq -c | sort --numeric --reverse | head -n10
Removals by year computed as:
 grep ftpmaster: removals-full.txt | \
   perl -pe 's/.* (\d{4}) \d{2}:\d{2}:\d{2}.*/$1/' | uniq -c | tail -n6
(yes, both could be done with fewer commands)
Filed under: Debian

26 April 2016

Matthias Klumpp: A GNOME Software Hackfest report

Two weeks ago was the GNOME Software hackfest in London, and I've been there! And I only now found the time to blog about it, but better late than never.

Arriving in London and finding the Red Hat offices

After being stuck in trains for the weekend, but fortunately arriving at the airport in time, I finally made it to London, with quite some delay due to the slow bus transfer from Stansted Airport. After finding the hotel, the next issue was to get food, and a place which accepted my credit card, which was surprisingly hard; in defence of London I must say, though, that it was a Sunday, 7 p.m., and my card is somewhat special (in Canada, it managed to crash some card readers, so they needed a hard reset). While searching for food, I also found, by accident, the Red Hat offices where the hackfest was starting the next day. My hotel, the office and the Tower Bridge were really close together, which was awesome! The last time I had been to London was in 2008, and only for a day, so being that close to the city center was great. The hackfest didn't leave any time to visit the city much, but by being close to the center, one could hardly avoid the London experience.

Cool people working on great stuff

That's basically the summary of the hackfest. It was awesome to meet with Richard Hughes again; we hadn't seen each other in person since 2011, but work on lots of stuff together. This was especially important, since we managed to resolve quite a few disagreements we had. Richard even almost managed to make me give in to adding <kudos/> to the AppStream spec, something which I was pretty much against supporting (it didn't make it in yet, but I am no longer against the idea of having it; the remaining issues are solvable). Meeting Iain Lane again (after FOSDEM) was also very nice, and seeing other people I had only worked with over IRC or bug reports (e.g. William, Kalev, ...) was great. Also lots of new people were there, like the guys from Endless, who build their low-budget computer for developing/emerging countries on top of GNOME and Linux technologies. It's pretty cool stuff they do, you should check out their website! (They also build their distribution on top of Debian, which is even more awesome, and something I didn't know before; because many Endless people I had met before were associated with GNOME or Fedora, I kind of implicitly assumed the system was based on Fedora.) The incarnation of GNOME Software used by Endless looks pretty different from what the normal GNOME user sees, since it's adjusted for a different audience and input method. But it looks great, and is a good example of how versatile GS already is! And for upstream GNOME, we've seen some pretty great mockups done by Endless too; I hope those will make it into production somehow.
Ironically, a "snapstore" was close to the office ;-)

Ironically, a snapstore was close to the office ;-)

XdgApp and the sandboxing of apps were also a big topic, aside from Ubuntu and Endless integration. Fortunately, Alexander Larsson was also there to answer all the sandboxing and XdgApp questions. I used the time to follow up on a conversation with Alexander we had started at FOSDEM this year, about the Limba vs. XdgApp bundling issue. While we are in line on the sandboxing approach, the way software is distributed is implemented differently in Limba and XdgApp, and it is bad to have too many bundling systems around (it doesn't make for a good story where we can just tell developers "ship as this bundling format, and it will be supported everywhere"). Talking with Alex about this was very nice, and I think there is a way out of the too-many-solutions dilemma, at least for Limba and XdgApp; I will blog about that separately soon. On the Ubuntu side, a lot of bugs and issues were squashed and changes upstreamed to GNOME, and people were generally doing their best to reduce Richard's bus factor on the project a little. I mainly worked on AppStream issues, finishing up the last pieces of appstream-generator and running it against some sample package sets (and later that week against the whole Debian archive). I also started to implement support for showing AppStream issues in the Debian PTS (this work is not finished yet). I also managed to solve a few bugs in the old DEP-11 generator and prepare another release for Ubuntu. We also enjoyed some good Japanese food, and some incredibly great, but also suddenly very expensive, Indian food (but that's a different story).

The most important thing for me, though, was to get together with people actually using AppStream metadata in software centers and also in more specialized places. This yielded some useful findings, e.g. that localized screenshots are not something weird, but actually a wanted feature of Endless for their curated app store. So localized screenshots will be part of the next AppStream spec. Also, there seems to be a general need to ship curation information for software centers somehow (which apps are featured? how are they styled? special banners for some featured apps, "app of the day" features, etc.). This problem hasn't been solved, since it's highly implementation-specific, and AppStream should be distro-agnostic. But it is something we might be able to address in a generic way sooner or later (I need to talk to people at KDE and Elementary about it).

In summary

It was a great event! Going to conferences and hackfests always makes me feel like it moves projects leaps ahead, even if you do little coding. Sorting out issues together with people you see in person (rather than communicating with them via text messages or video chat) is IMHO always the most productive way to move forward (yeah, unless you do this every week, but I think you get my point). For me, as the only (and youngest ^^) developer at the hackfest who was not employed by any company in the FLOSS business, the hackfest was also motivation to continue investing spare time into working on these projects. So, the only thing left to do is a huge shout-out of THANK YOU to the Ubuntu Community Fund, and therefore the Ubuntu community, for sponsoring me! You rock! Also huge thanks to Canonical for organizing the sponsorship really quickly, so I didn't get into trouble with paying for my flights.
Laney and attente on the Millennium Bridge after we walked the distance between Red Hat and Canonical's offices.

To worried KDE people: No, I didn't leave the blue side! I just generally work on cross-desktop stuff, and would like all desktops to work as well as possible.
